Normalized Relevance Distance - A Stable Metric for Computing Semantic Relatedness over Reference Corpora

Authors

Christoph Schaefer

Daniel Hienert

Thomas Gottron

Abstract

We propose the Normalized Relevance Distance (NRD): a robust metric for computing semantic relatedness between terms. NRD makes use of a controlled reference corpus for a statistical analysis. The analysis is based on the relevance scores and joint occurrence of terms in documents. On the basis of established reference datasets, we demonstrate that NRD does not require sophisticated data tuning and is less dependent on the choice of the reference corpus than comparable approaches.
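The abstract does not state the formula, so the following is only a sketch of how a distance built on relevance scores and joint occurrence could be computed. It assumes a form modeled on the Normalized Google Distance, with document counts replaced by summed relevance scores. The relevance weighting (term frequency scaled by the most frequent term in the document), the function names `relevance` and `nrd_sketch`, and the handling of edge cases are all assumptions made for illustration, not the authors' definition.

```python
import math
from collections import Counter

def relevance(doc):
    """Assumed relevance score: term frequency scaled into (0, 1]."""
    tf = Counter(doc)
    top = max(tf.values())
    return {term: count / top for term, count in tf.items()}

def nrd_sketch(docs, x, y):
    """NGD-shaped distance over summed relevance scores (illustrative only)."""
    scores = [relevance(doc) for doc in docs if doc]  # skip empty documents
    n = len(scores)
    # Relevance "mass" of each term, and of their joint occurrence.
    fx = sum(s.get(x, 0.0) for s in scores)
    fy = sum(s.get(y, 0.0) for s in scores)
    fxy = sum(s.get(x, 0.0) * s.get(y, 0.0) for s in scores)
    if fxy == 0.0:
        return math.inf  # the terms never share a document
    denom = math.log(n) - min(math.log(fx), math.log(fy))
    if denom == 0.0:
        return 0.0  # both terms are fully relevant in every document
    return (max(math.log(fx), math.log(fy)) - math.log(fxy)) / denom

# Hypothetical toy corpus: each document is a list of tokens.
corpus = [
    ["car", "car", "engine", "road"],
    ["car", "engine", "wheel"],
    ["car", "road"],
    ["banana", "fruit"],
    ["banana", "fruit", "apple"],
    ["engine", "fruit"],
]
print(nrd_sketch(corpus, "car", "engine"))
print(nrd_sketch(corpus, "engine", "fruit"))
```

Under these assumptions, terms that often appear together with high relevance get a smaller distance, and terms that never co-occur get an infinite one. When every relevance score is 1, the sketch reduces to the count-based Normalized Google Distance.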


Similar resources

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...


Computing Semantic Relatedness in German with Revised Information Content Metrics

The paper presents an application of information content based metrics to compute semantic relatedness of word senses in German. The main contributions are: an annotation study based on a revised definition of semantic relatedness beyond synonymy, an extension of Resnik’s (1995) procedure for computing information content of concepts for strongly inflected languages, an application of informati...


Czech Dataset for Semantic Similarity and Relatedness

This paper introduces a Czech dataset for semantic similarity and semantic relatedness. The dataset contains word pairs with hand annotated scores that indicate the semantic similarity and semantic relatedness of the words. The dataset contains 953 word pairs compiled from 9 different sources. It contains words and their contexts taken from real text corpora including extra examples when the wo...


Semantic Relatedness Estimation using the Layout Information of Wikipedia Articles

Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to solve the problem uses word frequency to estimate relevance. Therefore, the relevance of words with low frequency cannot always be well estimated. To improve the relevance...


Publication date: 2014
